CDB_cycles_AnalysisOfParameters

In this notebook we study the different parameters of our method, Complexity Driven Bagging, so as to offer a recommended range of values to the final user. In particular, we have three parameters: alpha, split and the number of cycles.

Besides these 3 parameters, we have obtained results for different complexity measures. For the analysis of the parameters, we have aggregated results over the different complexity measures.

First, we study each parameter (alpha, split, number of cycles) independently, identifying the values for which there are no significant differences and which can therefore be eliminated from the range of recommended values.

In particular:

After this analysis, we will have a first range recommendation for every parameter. Notice that, in all cases, we take into account the mean, median and standard deviation of the accuracy.

Parameter analysis

Mean, median and standard deviation of accuracy for all levels of split

table_split <- datos %>%
  group_by(split) %>%
  summarise_at(vars(accuracy_mean_mean),  list(mean = mean, median = median, std = sd))
knitr::kable(table_split)
split mean median std
1 0.8110460 0.7967871 0.1212139
2 0.8125088 0.7959839 0.1212647
4 0.8129489 0.7946452 0.1213289
6 0.8131103 0.7943984 0.1213001
8 0.8133743 0.7939779 0.1211680
10 0.8134900 0.7944709 0.1211299
12 0.8135405 0.7943101 0.1210803
14 0.8136284 0.7938616 0.1211058
16 0.8137780 0.7937970 0.1209858
18 0.8137304 0.7939759 0.1210338
20 0.8138354 0.7942397 0.1210338
22 0.8136747 0.7941315 0.1210665
24 0.8139666 0.7938021 0.1209256
26 0.8138905 0.7943179 0.1210517
28 0.8140333 0.7946578 0.1208977
30 0.8140107 0.7944929 0.1209749

The higher the value of split, the higher the mean of accuracy (with some exceptions), the lower the median and the lower the standard deviation. Medium-low split values?
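As a sanity check on these trends, we can rank-correlate split with the mean and median columns (values copied verbatim from the table above):

```r
# Spearman rank correlations between split and the accuracy summaries above
split      <- c(1, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30)
mean_acc   <- c(0.8110460, 0.8125088, 0.8129489, 0.8131103, 0.8133743, 0.8134900,
                0.8135405, 0.8136284, 0.8137780, 0.8137304, 0.8138354, 0.8136747,
                0.8139666, 0.8138905, 0.8140333, 0.8140107)
median_acc <- c(0.7967871, 0.7959839, 0.7946452, 0.7943984, 0.7939779, 0.7944709,
                0.7943101, 0.7938616, 0.7937970, 0.7939759, 0.7942397, 0.7941315,
                0.7938021, 0.7943179, 0.7946578, 0.7944929)
cor(split, mean_acc,   method = "spearman")  # approx.  0.97: strong upward trend
cor(split, median_acc, method = "spearman")  # approx. -0.32: mild downward trend
```

The correlations confirm the direction of both trends, with the downward trend in the median being much weaker than the upward trend in the mean.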

If we compare whether there are significant differences among the different split values (once aggregated per n_cycle), we obtain that:

For the mean of the accuracy, there are no significant differences among:

  • 4 with 6

  • 6 with 8 and 12

  • 8 with 12,14,16

  • 10 with 12, 14, 18, 22

  • 12 with 14, 16, 18, 20, 22

  • 14 with 16, 18, 20, 22, 26

  • From 16 to 30, almost all comparisons are not significantly different -> maximum value of split should be 16

For the median of the accuracy, there are no significant differences among:

  • 4 with 6 and 10

  • 6 with 8, 10, 12

  • 8 with 10, 12, 14, 16, 18, 20, 22

  • 10 with 12, 14, 16, 18, 20, 22

  • 12 with 14, 16, 18, 20, 22

  • 14 with 16, 18, 20, 22, 26

  • From 16 to 30, almost all comparisons are not significantly different -> maximum value of split should be 16

For the std of the accuracy, there are no significant differences among:

  • 4 with 6 and 8

  • 6 with 8, 10, 12

  • 8 with 10, 12, 14, 16, 20

  • 10 with 12, 14, 16, 20, 22, 30

  • 12 with 14, 16, 20, 22, 26, 30

  • From 14 to 30, almost all comparisons are not significantly different -> maximum value of split should be 14
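The three lists above come from pairwise paired Wilcoxon tests with Bonferroni correction. A minimal base-R sketch of this kind of comparison (on mock data; the real analysis uses the aggregated `datos` and the rstatix `wilcox_test` wrapper):

```r
# paired pairwise Wilcoxon tests between three split levels, Bonferroni-adjusted
set.seed(1)
datasets <- 20                         # mock: one observation per dataset and split level
acc <- data.frame(
  split    = factor(rep(c(2, 4, 16), each = datasets)),
  accuracy = c(rnorm(datasets, 0.812, 0.01),
               rnorm(datasets, 0.813, 0.01),
               rnorm(datasets, 0.814, 0.01))
)
pw <- pairwise.wilcox.test(acc$accuracy, acc$split,
                           paired = TRUE, p.adjust.method = "bonferroni")
pw$p.value  # pairs with adjusted p above the threshold are "not significantly different"
```

The same matrix of adjusted p-values, read row by row, is what the bullet lists above summarise.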

Mean, median and standard deviation of accuracy for all levels of alpha

table_alpha <- datos %>%
  group_by(alpha) %>%
  summarise_at(vars(accuracy_mean_mean),  list(mean = mean, median = median, std = sd))
knitr::kable(table_alpha)
alpha mean median std
2 0.8129428 0.7957598 0.1211778
4 0.8129658 0.7958410 0.1212035
6 0.8128979 0.7959097 0.1210609
8 0.8127810 0.7955892 0.1210810
10 0.8126166 0.7957162 0.1210488
12 0.8123364 0.7958433 0.1212849
14 0.8123072 0.7956720 0.1213272
16 0.8121494 0.7951638 0.1213054
18 0.8123956 0.7953308 0.1211186
20 0.8121692 0.7953601 0.1213016

The higher the value of alpha, the lower the mean and the median of accuracy. The standard deviation stays lower for low-to-medium values. -> Low-to-medium values of alpha, lower than 12.
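The same rank-correlation check can be applied to the alpha table (values copied verbatim from the table above):

```r
# Spearman rank correlations between alpha and the accuracy summaries above
alpha      <- seq(2, 20, by = 2)
mean_acc   <- c(0.8129428, 0.8129658, 0.8128979, 0.8127810, 0.8126166,
                0.8123364, 0.8123072, 0.8121494, 0.8123956, 0.8121692)
median_acc <- c(0.7957598, 0.7958410, 0.7959097, 0.7955892, 0.7957162,
                0.7958433, 0.7956720, 0.7951638, 0.7953308, 0.7953601)
cor(alpha, mean_acc,   method = "spearman")  # approx. -0.89: clear downward trend
cor(alpha, median_acc, method = "spearman")  # approx. -0.71: downward trend
```

Both correlations are clearly negative, supporting the recommendation of low-to-medium alpha values.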

If we compare whether there are significant differences among the different alpha values (once aggregated per n_cycle), we obtain that:

For the mean of the accuracy, there are ONLY significant differences among:

  • 2 with 10

  • 10 with 12, 14, 16, 18, 20

For the median of the accuracy, there are ONLY significant differences among:

  • 2 with 4, 6, 8, 10, 14

  • 10 with 12, 16, 20

For the std of the accuracy, there are NO significant differences among:

  • 4 with 6 and 8

  • 6 with 8, 10, 12

  • 8 with 10, 12

  • From 10 to 20, almost all comparisons are not significantly different -> maximum value of alpha should be 10

Mean, median and standard deviation of accuracy for all levels of n_cycles (for some split values)

We cannot summarise ‘n_cycle’ globally because the number of cycles depends on the value of split. Thus, we show some representative cases.
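For instance, the range of cycle levels available per split value can be listed with a grouped maximum (a sketch on a mock of `datos`; with the real data, pass `datos` itself):

```r
# maximum n_cycle available for each split value
datos_mock <- data.frame(split   = c(1, 1, 1, 2, 2, 4, 4),
                         n_cycle = c(1, 50, 100, 30, 60, 20, 34))
tapply(datos_mock$n_cycle, datos_mock$split, max)
```

In the tables that follow, split = 2 reaches 60 cycles, split = 4 reaches 34 and split = 10 reaches 15.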

split = 1

table_split1 <- datos %>% filter(split == 1) %>%
  group_by(n_cycle) %>%
  summarise_at(vars(accuracy_mean_mean),  list(mean = mean, median = median, std = sd))
#knitr::kable(table_split1)

#datatable(table_split1)
## only for html output
#library(DT)

## Create an interactive table with pagination
#datatable(table_split1, 
#          options = list(pageLength = 15, # show 15 rows per page
#                         lengthMenu = c(15, 30, 50, 100), # rows-per-page options
#                         autoWidth = TRUE))

The higher the number of cycles, the higher the mean and median of accuracy and the lower the standard deviation. For high numbers of cycles, the accuracy clearly stabilizes and there is not always a clear increase over time. For example, results with 89 cycles are better than with 100.
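The stabilization can be quantified by the marginal gain in mean accuracy per extra cycle (a sketch on mock values; with the real data, use `table_split1$mean`):

```r
# marginal gain in mean accuracy when adding one more cycle;
# values close to zero (or negative) signal stabilization
mean_by_cycle <- c(0.788, 0.798, 0.804, 0.806, 0.808, 0.8079)  # mock values
gain <- diff(mean_by_cycle)
round(gain, 4)  # the last gain is negative: no clear increase any more
```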

split = 2

table_split2 <- datos %>% filter(split == 2) %>%
  group_by(n_cycle) %>%
  summarise_at(vars(accuracy_mean_mean),  list(mean = mean, median = median, std = sd))
knitr::kable(table_split2)
n_cycle mean median std
1 0.7878423 0.7756056 0.1302345
2 0.7978929 0.7832387 0.1268106
3 0.8042211 0.7884107 0.1244836
4 0.8062211 0.7895648 0.1237105
5 0.8082450 0.7907575 0.1228781
6 0.8090127 0.7913615 0.1227202
7 0.8099046 0.7923985 0.1223956
8 0.8106248 0.7931522 0.1220386
9 0.8109993 0.7932107 0.1219353
10 0.8113641 0.7935095 0.1217677
11 0.8117868 0.7939090 0.1215668
12 0.8118698 0.7943144 0.1215419
13 0.8121306 0.7944254 0.1214045
14 0.8123525 0.7948739 0.1213392
15 0.8126514 0.7951106 0.1212693
16 0.8126805 0.7951063 0.1212591
17 0.8128990 0.7955438 0.1212589
18 0.8130042 0.7956286 0.1212281
19 0.8131348 0.7956102 0.1212161
20 0.8131206 0.7961042 0.1211763
21 0.8134105 0.7959170 0.1209994
22 0.8133977 0.7958501 0.1210053
23 0.8135931 0.7961845 0.1208718
24 0.8136424 0.7961932 0.1208730
25 0.8137241 0.7960968 0.1207911
26 0.8137442 0.7965650 0.1207932
27 0.8138485 0.7960773 0.1207203
28 0.8138378 0.7967582 0.1207137
29 0.8138301 0.7968033 0.1207834
30 0.8138605 0.7967202 0.1207607
31 0.8139992 0.7967785 0.1207395
32 0.8140961 0.7973302 0.1206660
33 0.8140979 0.7971923 0.1206819
34 0.8141011 0.7966902 0.1206659
35 0.8141745 0.7969440 0.1206125
36 0.8142054 0.7966909 0.1206157
37 0.8142043 0.7964537 0.1206885
38 0.8142868 0.7967746 0.1206271
39 0.8143627 0.7965885 0.1205761
40 0.8143561 0.7967773 0.1205673
41 0.8143234 0.7965640 0.1205901
42 0.8143553 0.7965451 0.1205912
43 0.8143429 0.7965205 0.1205719
44 0.8143546 0.7967322 0.1205887
45 0.8143827 0.7968197 0.1205847
46 0.8144383 0.7971508 0.1205513
47 0.8144575 0.7969159 0.1205463
48 0.8144753 0.7973029 0.1205636
49 0.8145061 0.7972868 0.1205463
50 0.8145415 0.7973226 0.1205443
51 0.8145303 0.7973289 0.1205242
52 0.8145728 0.7972118 0.1205605
53 0.8145070 0.7970549 0.1205820
54 0.8145590 0.7973896 0.1205244
55 0.8145553 0.7971276 0.1205232
56 0.8145220 0.7969456 0.1206034
57 0.8145558 0.7968712 0.1205985
58 0.8145750 0.7969573 0.1205827
59 0.8146399 0.7969873 0.1205830
60 0.8146042 0.7972197 0.1206055

The higher the number of cycles, the higher the mean and median of accuracy and the lower the standard deviation. For high numbers of cycles, the accuracy clearly stabilizes and there is not always a clear increase over time.

split = 4

table_split4 <- datos %>% filter(split == 4) %>%
  group_by(n_cycle) %>%
  summarise_at(vars(accuracy_mean_mean),  list(mean = mean, median = median, std = sd))
knitr::kable(table_split4)
n_cycle mean median std
1 0.7985992 0.7809906 0.1267342
2 0.8052539 0.7865956 0.1245487
3 0.8087263 0.7898662 0.1231415
4 0.8101471 0.7921213 0.1225476
5 0.8111221 0.7922317 0.1222535
6 0.8117260 0.7930016 0.1220022
7 0.8121662 0.7937643 0.1217520
8 0.8126286 0.7929885 0.1215884
9 0.8129014 0.7941532 0.1214551
10 0.8131537 0.7937974 0.1213373
11 0.8133987 0.7939763 0.1212369
12 0.8135283 0.7943581 0.1211784
13 0.8137883 0.7944978 0.1209563
14 0.8138608 0.7946963 0.1210106
15 0.8138553 0.7947821 0.1210093
16 0.8139686 0.7950329 0.1209964
17 0.8141007 0.7950047 0.1209703
18 0.8141282 0.7952039 0.1208968
19 0.8141507 0.7951631 0.1209582
20 0.8142332 0.7949935 0.1209015
21 0.8142853 0.7950799 0.1208754
22 0.8143584 0.7952926 0.1208503
23 0.8144006 0.7951492 0.1208439
24 0.8144707 0.7953898 0.1208550
25 0.8145182 0.7955021 0.1207992
26 0.8146189 0.7955435 0.1207415
27 0.8146431 0.7958857 0.1207559
28 0.8146555 0.7957141 0.1207142
29 0.8146843 0.7959362 0.1206945
30 0.8147846 0.7955836 0.1206679
31 0.8148139 0.7954345 0.1206182
32 0.8148141 0.7953640 0.1206461
33 0.8148755 0.7955594 0.1206363
34 0.8149022 0.7955458 0.1206320

The higher the number of cycles, the higher the mean and median of accuracy and the lower the standard deviation. For high numbers of cycles, the accuracy stabilizes but keeps showing an increasing trend. The longer the cycle, the less stable the trend (though still increasing).

split = 10

table_split10 <- datos %>% filter(split == 10) %>%
  group_by(n_cycle) %>%
  summarise_at(vars(accuracy_mean_mean),  list(mean = mean, median = median, std = sd))
knitr::kable(table_split10)
n_cycle mean median std
1 0.8071858 0.7880451 0.1237565
2 0.8108445 0.7918125 0.1223758
3 0.8124787 0.7937626 0.1217319
4 0.8132270 0.7943083 0.1214067
5 0.8136229 0.7948287 0.1211650
6 0.8138476 0.7946337 0.1210393
7 0.8141833 0.7949090 0.1208817
8 0.8142894 0.7959271 0.1209074
9 0.8144666 0.7951143 0.1208267
10 0.8145978 0.7958262 0.1207563
11 0.8146281 0.7947367 0.1207611
12 0.8146653 0.7944111 0.1207817
13 0.8147261 0.7949773 0.1206724
14 0.8147200 0.7950719 0.1207383
15 0.8148667 0.7955196 0.1206696

The higher the number of cycles, the higher the mean and median of accuracy and the lower the standard deviation. For high numbers of cycles, the accuracy stabilizes but keeps showing an increasing trend. The longer the cycle, the less stable the trend (though still increasing).

Number of cycles

# We have to run the analysis for every combo_alpha_split
valores_combo = levels(datos$combo_alpha_split)
n_combo = length(valores_combo)
combo_friedman = data.frame(valores_combo)
combo_friedman$p_value = rep(NA,n_combo)

for (i in valores_combo){
  #print(i)
  datos_i = datos[datos$combo_alpha_split==i,]
  fri = friedman.test(accuracy_mean_mean ~ n_cycle |Dataset, data=as.matrix(datos_i))
  combo_friedman[combo_friedman$valores_combo==i,2] = fri$p.value
}
combo_friedman[combo_friedman$p_value > 0.05, ]  # note the trailing comma: we filter rows, not columns
# this returns no rows, i.e., in all cases there are significant differences

Once we have checked that there are significant differences for at least one value in each combo, we run multiple comparisons to analyze when adding another cycle is not worthwhile, since the increase is not significant.

dif_no_sig <- data.frame(valores_combo)
dif_no_sig$niveles = rep(NA,n_combo)

# Left commented out because it takes a long time to run

# for (i in valores_combo){
#   print(i)
#   datos_i = datos[datos$combo_alpha_split==i,]
#   datos_i$n_cycle <- factor(datos_i$n_cycle) # the factor levels change in each subset
#   pwc2 <- datos_i %>% 
#     wilcox_test(accuracy_mean_mean ~ n_cycle, paired = TRUE, p.adjust.method = "bonferroni")
#   # Keep the comparisons with non-significant differences (threshold: adjusted p > 0.1)
#   no_significativas <- pwc2[pwc2$p.adj>0.1,]
# 
#   
#   # if not all comparisons involving a level are non-significant, we drop that level;
#   # i.e., it is not enough that only 3 vs 5 shows no difference while the rest (3-6, 3-7, etc.) do
#   max_cycles = max(as.numeric(pwc2$group2))
#   valores_check <- unique(as.numeric(no_significativas$group1))
#   for (v in valores_check){
#     if (sum(no_significativas$group1 == v) <(max_cycles - v) ){
#       no_significativas = no_significativas[no_significativas$group1!=v,]
#     }
#   }
#   
#   # Extract the levels of the pairs with non-significant differences
#   niveles_no_significativos <- unique(c(no_significativas$group1, no_significativas$group2))
# 
#   dif_no_sig[dif_no_sig$valores_combo==i,2] = paste(niveles_no_significativos, collapse = ", ")
# }

#write.csv(dif_no_sig, "CDB_cycles_ParametersComboAlphaSplit_dif_no_signif_cycles_mean.csv")
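The level-filtering rule used in the loop above (a cycle count survives only if it is non-significant against every larger cycle count) can be illustrated on a small mock set of pairwise results:

```r
# mock pairwise results: group1 vs group2 cycle counts and adjusted p-values
pwc <- data.frame(group1 = c(3, 3, 3, 4, 4, 5),
                  group2 = c(4, 5, 6, 5, 6, 6),
                  p.adj  = c(0.02, 0.50, 0.03, 0.60, 0.70, 0.90))
no_sig <- pwc[pwc$p.adj > 0.1, ]
max_cycles <- max(pwc$group2)
# keep level v only if ALL comparisons of v against larger levels are non-significant
for (v in unique(no_sig$group1)) {
  if (sum(no_sig$group1 == v) < (max_cycles - v)) {
    no_sig <- no_sig[no_sig$group1 != v, ]
  }
}
sort(unique(c(no_sig$group1, no_sig$group2)))  # with this mock: 4 5 6 (cycle 3 is dropped)
```

Cycle 3 is dropped because it is non-significant against 5 but significant against 4 and 6.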

In this dataframe we have, for every combination of alpha and split, the cycle counts among which there are no significant differences.

dif_no_sig_mean <- read.csv('CDB_cycles_ParametersComboAlphaSplit_dif_no_signif_cycles_mean.csv') 
head(dif_no_sig_mean)
  X   valores_combo
1 1  alpha10-split1
2 2 alpha10-split10
3 3 alpha10-split12
4 4 alpha10-split14
5 5 alpha10-split16
6 6 alpha10-split18
                                                                                                                                                                                                                                                                                                                                      niveles
1 18, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 19, 20, 100
2                                                                                                                                                                                                                                                                                                       5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
3                                                                                                                                                                                                                                                                                                                   5, 6, 7, 8, 9, 10, 11, 12
4                                                                                                                                                                                                                                                                                                                 3, 4, 5, 6, 7, 8, 9, 10, 11
5                                                                                                                                                                                                                                                                                                                        4, 5, 6, 7, 8, 9, 10
6                                                                                                                                                                                                                                                                                                                               5, 6, 7, 8, 9

Let’s relate that to the number of models, to get a first view of where to stop adding models. We create two different columns:

  • num_models: the number of models implied by the first (minimum) number of cycles that shows no significant differences with all its consecutive cycle counts.

  • num_models2: the same concept as num_models, but using the second such value of the number of cycles, in case we want to be more conservative.
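As a worked instance of the num_models computation, which multiplies the cycle count by (2*split + 1), take the alpha10-split1 row shown in the head output below, whose smallest non-significant cycle count is 18:

```r
# reproducing the alpha10-split1 row: smallest non-significant cycle counts 18, 19, ...
niveles     <- "18, 19, 20"   # first entries of the 'niveles' string (truncated)
valor_split <- 1
cycles      <- sort(as.numeric(strsplit(niveles, ", ")[[1]]))
num_models  <- cycles[1] * (2 * valor_split + 1)  # 18 * 3 = 54
num_models2 <- cycles[2] * (2 * valor_split + 1)  # 19 * 3 = 57
c(num_models, num_models2)
```

These match the num_models = 54 and num_models2 = 57 reported for that row.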

# Variables to character
dif_no_sig_mean$niveles <- as.character(dif_no_sig_mean$niveles)
dif_no_sig_mean$valores_combo <- as.character(dif_no_sig_mean$valores_combo)

# Order the values 
dif_no_sig_mean$niveles <- sapply(strsplit(dif_no_sig_mean$niveles, ", "), function(x) {
  paste(sort(as.numeric(x)), collapse = ", ")
})

# Extract the numeric value after "split" in valores_combo
dif_no_sig_mean$valor_split <- as.numeric(gsub(".*split", "", dif_no_sig_mean$valores_combo))

# New columns with number of models
dif_no_sig_mean$num_models <- mapply(function(a, b) {
  min(as.numeric(strsplit(a, ", ")[[1]])) * (2*b +1)
}, dif_no_sig_mean$niveles, dif_no_sig_mean$valor_split)

# New columns with number of models (for the second value)
dif_no_sig_mean$num_models2 <- mapply(function(a, b) {
  valores <- sort(as.numeric(strsplit(a, ", ")[[1]])) 
  segundo_min <- ifelse(length(valores) > 1, valores[2], valores[1])  # take the second smallest, or the first if there is only one
  segundo_min * (2*b +1)
}, dif_no_sig_mean$niveles, dif_no_sig_mean$valor_split)

head(dif_no_sig_mean)
  X   valores_combo
1 1  alpha10-split1
2 2 alpha10-split10
3 3 alpha10-split12
4 4 alpha10-split14
5 5 alpha10-split16
6 6 alpha10-split18
                                                                                                                                                                                                                                                                                                                                      niveles
1 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100
2                                                                                                                                                                                                                                                                                                       5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
3                                                                                                                                                                                                                                                                                                                   5, 6, 7, 8, 9, 10, 11, 12
4                                                                                                                                                                                                                                                                                                                 3, 4, 5, 6, 7, 8, 9, 10, 11
5                                                                                                                                                                                                                                                                                                                        4, 5, 6, 7, 8, 9, 10
6                                                                                                                                                                                                                                                                                                                               5, 6, 7, 8, 9
  valor_split num_models num_models2
1           1         54          57
2          10        105         126
3          12        125         150
4          14         87         116
5          16        132         165
6          18        185         222

We perform the same analysis for the median and the standard deviation.

For the median:

# dif_no_sig_mean$niveles_mediana = rep(NA,n_combo)
# 
# # Left commented out because it takes a long time to run
# 
# for (i in valores_combo){
#   print(i)
#   datos_i = datos[datos$combo_alpha_split==i,]
#   datos_i$n_cycle <- factor(datos_i$n_cycle) # the factor levels change in each subset
#   pwc2 <- datos_i %>%
#     wilcox_test(accuracy_mean_median ~ n_cycle, paired = TRUE, p.adjust.method = "bonferroni")
#   # Keep the comparisons with non-significant differences (threshold: adjusted p > 0.1)
#   no_significativas <- pwc2[pwc2$p.adj>0.1,]
# 
# 
#   # if not all comparisons involving a level are non-significant, we drop that level;
#   # i.e., it is not enough that only 3 vs 5 shows no difference while the rest (3-6, 3-7, etc.) do
#   max_cycles = max(as.numeric(pwc2$group2))
#   valores_check <- unique(as.numeric(no_significativas$group1))
#   for (v in valores_check){
#     if (sum(no_significativas$group1 == v) <(max_cycles - v) ){
#       no_significativas = no_significativas[no_significativas$group1!=v,]
#     }
#   }
# 
#   # Extract the levels of the pairs with non-significant differences
#   niveles_no_significativos <- unique(c(no_significativas$group1, no_significativas$group2))
# 
#   dif_no_sig_mean[dif_no_sig_mean$valores_combo==i,'niveles_mediana'] = paste(niveles_no_significativos, collapse = ", ")
# }

#write.csv(dif_no_sig_mean, "CDB_cycles_ParametersComboAlphaSplit_dif_no_signif_cycles_mean_median.csv")

For the standard deviation:

# dif_no_sig_mean$niveles_std = rep(NA,n_combo)
# 
# # Left commented out because it takes a long time to run
# 
# for (i in valores_combo){
#   print(i)
#   datos_i = datos[datos$combo_alpha_split==i,]
#   datos_i$n_cycle <- factor(datos_i$n_cycle) # the factor levels change in each subset
#   pwc2 <- datos_i %>%
#     wilcox_test(accuracy_mean_std ~ n_cycle, paired = TRUE, p.adjust.method = "bonferroni")
#   # Keep the comparisons with non-significant differences (threshold: adjusted p > 0.1)
#   no_significativas <- pwc2[pwc2$p.adj>0.1,]
# 
# 
#   # if not all comparisons involving a level are non-significant, we drop that level;
#   # i.e., it is not enough that only 3 vs 5 shows no difference while the rest (3-6, 3-7, etc.) do
#   max_cycles = max(as.numeric(pwc2$group2))
#   valores_check <- unique(as.numeric(no_significativas$group1))
#   for (v in valores_check){
#     if (sum(no_significativas$group1 == v) <(max_cycles - v) ){
#       no_significativas = no_significativas[no_significativas$group1!=v,]
#     }
#   }
# 
#   # Extract the levels of the pairs with non-significant differences
#   niveles_no_significativos <- unique(c(no_significativas$group1, no_significativas$group2))
# 
#   dif_no_sig_mean[dif_no_sig_mean$valores_combo==i,'niveles_std'] = paste(niveles_no_significativos, collapse = ", ")
# }

#write.csv(dif_no_sig_mean, "CDB_cycles_ParametersComboAlphaSplit_dif_no_signif_cycles_mean_median_std.csv")

Now we relate the number of cycles to the number of models for all statistical measures (mean, median, std):

dif_no_sig_all <- read.csv('CDB_cycles_ParametersComboAlphaSplit_dif_no_signif_cycles_mean_median_std.csv') 
head(dif_no_sig_all)
  X.1 X   valores_combo
1   1 1  alpha10-split1
2   2 2 alpha10-split10
3   3 3 alpha10-split12
4   4 4 alpha10-split14
5   5 5 alpha10-split16
6   6 6 alpha10-split18
                                                                                                                                                                                                                                                                                                                                      niveles
1 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100
2                                                                                                                                                                                                                                                                                                       5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
3                                                                                                                                                                                                                                                                                                                   5, 6, 7, 8, 9, 10, 11, 12
4                                                                                                                                                                                                                                                                                                                 3, 4, 5, 6, 7, 8, 9, 10, 11
5                                                                                                                                                                                                                                                                                                                        4, 5, 6, 7, 8, 9, 10
6                                                                                                                                                                                                                                                                                                                               5, 6, 7, 8, 9
  valor_split num_models num_models2
1           1         54          57
2          10        105         126
3          12        125         150
4          14         87         116
5          16        132         165
6          18        185         222
                                                                                                                                                                                                                                                                                                                          niveles_mediana
1 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100
2                                                                                                                                                                                                                                                                                                4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
3                                                                                                                                                                                                                                                                                                         3, 5, 6, 7, 8, 9, 10, 11, 4, 12
4                                                                                                                                                                                                                                                                                                          2, 3, 4, 5, 6, 7, 8, 9, 10, 11
5                                                                                                                                                                                                                                                                                                                 3, 4, 5, 6, 7, 8, 9, 10
6                                                                                                                                                                                                                                                                                                                     3, 4, 5, 6, 7, 8, 9
                                                                                                                                                                                                                                                                                                                                                                           niveles_std
1 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100
2                                                                                                                                                                                                                                                                                                                                                5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
3                                                                                                                                                                                                                                                                                                                                                            5, 6, 7, 8, 9, 10, 11, 12
4                                                                                                                                                                                                                                                                                                                                                          3, 4, 5, 6, 7, 8, 9, 10, 11
5                                                                                                                                                                                                                                                                                                                                                                    5, 6, 7, 8, 9, 10
6                                                                                                                                                                                                                                                                                                                                                                  3, 4, 5, 6, 7, 8, 9
# Variables to character
dif_no_sig_all$niveles_mediana <- as.character(dif_no_sig_all$niveles_mediana)
dif_no_sig_all$niveles_std <- as.character(dif_no_sig_all$niveles_std)
dif_no_sig_all$valores_combo <- as.character(dif_no_sig_all$valores_combo)

# Order the values 
dif_no_sig_all$niveles_mediana <- sapply(strsplit(dif_no_sig_all$niveles_mediana, ", "), function(x) {
  paste(sort(as.numeric(x)), collapse = ", ")
})

dif_no_sig_all$niveles_std <- sapply(strsplit(dif_no_sig_all$niveles_std, ", "), function(x) {
  paste(sort(as.numeric(x)), collapse = ", ")
})


# New columns with number of models
dif_no_sig_all$num_models_mediana <- mapply(function(a, b) {
  min(as.numeric(strsplit(a, ", ")[[1]])) * (2*b +1)
}, dif_no_sig_all$niveles_mediana, dif_no_sig_all$valor_split)

dif_no_sig_all$num_models_std <- mapply(function(a, b) {
  min(as.numeric(strsplit(a, ", ")[[1]])) * (2*b +1)
}, dif_no_sig_all$niveles_std, dif_no_sig_all$valor_split)

# New columns with number of models (for the second value)
dif_no_sig_all$num_models2_mediana <- mapply(function(a, b) {
  valores <- sort(as.numeric(strsplit(a, ", ")[[1]])) 
  segundo_min <- ifelse(length(valores) > 1, valores[2], valores[1])  # take the second smallest, or the first if there is only one
  segundo_min * (2*b +1)
}, dif_no_sig_all$niveles_mediana, dif_no_sig_all$valor_split)

dif_no_sig_all$num_models2_std <- mapply(function(a, b) {
  valores <- sort(as.numeric(strsplit(a, ", ")[[1]])) 
  segundo_min <- ifelse(length(valores) > 1, valores[2], valores[1])  # Take the second smallest value, or the first if there is only one
  segundo_min * (2*b +1)
}, dif_no_sig_all$niveles_std, dif_no_sig_all$valor_split)

# Also extract the value in cycles (smallest cycle level with no significant differences)
dif_no_sig_all$cycles_mean <- sapply(strsplit(dif_no_sig_all$niveles, ", "), function(x) {
  min(as.numeric(x))})

dif_no_sig_all$cycles_median <- sapply(strsplit(dif_no_sig_all$niveles_mediana, ", "), function(x) {
  min(as.numeric(x))})

dif_no_sig_all$cycles_std <- sapply(strsplit(dif_no_sig_all$niveles_std, ", "), function(x) {
  min(as.numeric(x))})


head(dif_no_sig_all)
  X.1 X   valores_combo
1   1 1  alpha10-split1
2   2 2 alpha10-split10
3   3 3 alpha10-split12
4   4 4 alpha10-split14
5   5 5 alpha10-split16
6   6 6 alpha10-split18
                                                                                                                                                                                                                                                                                                                                      niveles
1 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100
2                                                                                                                                                                                                                                                                                                       5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
3                                                                                                                                                                                                                                                                                                                   5, 6, 7, 8, 9, 10, 11, 12
4                                                                                                                                                                                                                                                                                                                 3, 4, 5, 6, 7, 8, 9, 10, 11
5                                                                                                                                                                                                                                                                                                                        4, 5, 6, 7, 8, 9, 10
6                                                                                                                                                                                                                                                                                                                               5, 6, 7, 8, 9
  valor_split num_models num_models2
1           1         54          57
2          10        105         126
3          12        125         150
4          14         87         116
5          16        132         165
6          18        185         222
                                                                                                                                                                                                                                                                                                                          niveles_mediana
1 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100
2                                                                                                                                                                                                                                                                                                4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
3                                                                                                                                                                                                                                                                                                         3, 4, 5, 6, 7, 8, 9, 10, 11, 12
4                                                                                                                                                                                                                                                                                                          2, 3, 4, 5, 6, 7, 8, 9, 10, 11
5                                                                                                                                                                                                                                                                                                                 3, 4, 5, 6, 7, 8, 9, 10
6                                                                                                                                                                                                                                                                                                                     3, 4, 5, 6, 7, 8, 9
                                                                                                                                                                                                                                                                                                                                                                           niveles_std
1 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100
2                                                                                                                                                                                                                                                                                                                                                5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
3                                                                                                                                                                                                                                                                                                                                                            5, 6, 7, 8, 9, 10, 11, 12
4                                                                                                                                                                                                                                                                                                                                                          3, 4, 5, 6, 7, 8, 9, 10, 11
5                                                                                                                                                                                                                                                                                                                                                                    5, 6, 7, 8, 9, 10
6                                                                                                                                                                                                                                                                                                                                                                  3, 4, 5, 6, 7, 8, 9
  num_models_mediana num_models_std num_models2_mediana num_models2_std
1                 57             21                  60              24
2                 84            105                 105             126
3                 75            125                 100             150
4                 58             87                  87             116
5                 99            165                 132             198
6                111            111                 148             148
  cycles_mean cycles_median cycles_std
1          18            19          7
2           5             4          5
3           5             3          5
4           3             2          3
5           4             3          5
6           5             3          3
#write.csv(dif_no_sig_all, "CDB_cycles_ParametersComboAlphaSplit_dif_no_signif_cycles_mean_median_std_num_models.csv")
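
The number-of-models columns above come from a simple conversion: with split value s, each cycle trains 2s + 1 models, so c cycles correspond to c * (2s + 1) models. A minimal standalone sketch of this conversion (`cycles_to_models` is a hypothetical helper name, not defined in the notebook):

```r
# A cycle built with split value s contains 2*s + 1 models, so the
# smallest non-significant cycle level maps to a number of models as:
cycles_to_models <- function(cycle_levels, split_value) {
  min(cycle_levels) * (2 * split_value + 1)
}

# Toy checks against the first rows of head(dif_no_sig_all):
cycles_to_models(18:100, 1)  # 18 * 3  = 54  (alpha10-split1)
cycles_to_models(5:15, 10)   # 5  * 21 = 105 (alpha10-split10)
```

These values reproduce the `num_models` column printed above.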

We have performed a multiple-comparisons analysis for the number of cycles with respect to the mean, median and standard deviation of the accuracy, and found, for each statistical measure, the number of cycles above which there is no significant difference. We now take the maximum number of cycles over the three statistical measures to select a range of cycles. After that, we can analyze which cycle structure is best (for example, a large number of short cycles or a small number of long cycles).

dif_no_sig_all$max_num_cycles <- apply(X=dif_no_sig_all[,c('cycles_mean','cycles_median','cycles_std')], MARGIN=1, FUN=max)
dif_no_sig_all$max_num_models <- apply(X=dif_no_sig_all[,c('num_models','num_models_mediana','num_models_std')], MARGIN=1, FUN=max)

#write.csv(dif_no_sig_all, "CDB_cycles_ParametersComboAlphaSplit_dif_no_signif_cycles_mean_median_std_num_models.csv")

If we analyze the number of models above which there are no significant differences, we can see that the maximum value is 294 and the 75th percentile is 175.5, implying that our maximum tested number of models (300) is sufficient and that ensembles with fewer models can obtain competitive accuracy results.

p<-ggplot(dif_no_sig_all, aes(x=max_num_models)) + 
  geom_histogram(color="black", fill="white")
p

summary(dif_no_sig_all$max_num_models)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   55.0   117.0   147.0   147.7   175.5   294.0 

From the study of alpha and split we know that:

  • From split = 16 to 30, almost all comparisons are not significantly different –> the maximum value of split should stay below 16. Domain: [1, 2, 4, 6, 8, 10, 12, 14]

  • Maximum alpha value should be 10-12: Domain: [2, 4, 6, 8, 10]

Note that these ranges are in line with the previous study, where we made no distinction regarding cycles.

Let’s filter now the previous information according to these ranges:

dif_no_sig_all$valor_alpha <- as.numeric(gsub("alpha([0-9]+)-split[0-9]+", "\\1", dif_no_sig_all$valores_combo))
# Filter the dataset to keep only the rows with alpha < 12 and split < 16
df_filtered <- dif_no_sig_all[(dif_no_sig_all$valor_alpha < 12 & dif_no_sig_all$valor_split < 16), ]
summary(df_filtered$max_num_models)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  55.00   89.25  116.00  112.67  125.00  200.00 

After filtering out the undesired values of alpha and split, the maximum number of models at which the significant differences stop is 200.

This df_filtered has 40 rows, that is, 40 combinations of alpha and split, so we will study them one by one. For each case we plot the evolution of the accuracy and mark the point from which there are supposedly no significant differences. We do this again aggregated over complexity measures, and then pick a couple of measures and visualize the plots for them individually. This part is somewhat unclear, because in general the accuracy keeps increasing with the number of models, but supposedly no longer significantly, and we should be guided by that. The proposed order is:

  1. Produce these accuracy-evolution plots for the 40 cases and inspect them for some complexity measure

  2. Select the best accuracy value for these 40 cases

  3. Draw conclusions from comparing 1 and 2

  4. For our method, choose the best parameter values in 2 cases:

    1. alpha, split and n_cycles reduced according to the multiple comparisons

    2. alpha and split reduced according to the multiple comparisons, and n_cycles not reduced

    Compare these 2 versions of our method against standard bagging and mixed bagging with the same number of parameters

datos_alpha_s <- datos %>% filter(alpha<12, split <16) 
datos_alpha_s <- datos_alpha_s %>% group_by(alpha, split, n_cycle, n_ensemble) %>%
  summarise_at(vars(accuracy_mean_mean),  list(accuracy_mean_dataset_mean = mean, accuracy_mean_dataset_median = median, accuracy_mean_dataset_std = sd))

For each combination of alpha and split, we plot with an orange dot the maximum accuracy achieved and with a blue dot the number of ensembles from which there are no significant differences. The first thing to note is that the orange dot is not always located at the maximum number of models tried (around 300), meaning that more models do not always imply better performance. The blue dot appears considerably earlier.

datos_alpha_s_1 <- datos_alpha_s %>% filter(alpha==2, split==1)
datos_alpha_s_1 <- as.data.frame(datos_alpha_s_1)
datos_alpha_s_1$n_cycle <- as.numeric(as.character(datos_alpha_s_1$n_cycle))
datos_alpha_s_1$n_ensemble <- as.numeric(as.character(datos_alpha_s_1$n_ensemble))

idmax = which.max(datos_alpha_s_1$accuracy_mean_dataset_mean)
# max(datos_alpha_s_1$accuracy_mean_dataset_mean)
max_acc_ensemble = datos_alpha_s_1[idmax,'n_ensemble']
max_signifi = dif_no_sig_all[(dif_no_sig_all$valor_alpha == 2) & (dif_no_sig_all$valor_split == 1),'max_num_models'] 
# datos_alpha_s_1[datos_alpha_s_1$n_ensemble==max_signifi,'accuracy_mean_dataset_mean']


plot(datos_alpha_s_1$n_ensemble, datos_alpha_s_1$accuracy_mean_dataset_mean, type='l', xlab='n ensembles', ylab = 'accuracy mean', main ='alpha = 2, split =1')
points(max_acc_ensemble, datos_alpha_s_1$accuracy_mean_dataset_mean[datos_alpha_s_1$n_ensemble == max_acc_ensemble], col='darkorange1', pch=19)
points(max_signifi, datos_alpha_s_1$accuracy_mean_dataset_mean[datos_alpha_s_1$n_ensemble == max_signifi], col='blue', pch=19)

The accuracy associated with each point is:

# The maximum in each case is
print(paste('Accuracy blue dot:', round(datos_alpha_s_1[datos_alpha_s_1$n_ensemble==max_signifi,'accuracy_mean_dataset_mean'],4)))
[1] "Accuracy blue dot: 0.8135"
print(paste('Accuracy orange dot:', round(max(datos_alpha_s_1$accuracy_mean_dataset_mean),4)))
[1] "Accuracy orange dot: 0.8149"

Let’s now make the same graph over the 40 combinations of alpha and split.

Common legend

  • Orange point: the number of ensembles achieving the highest accuracy value
  • Blue point: the number of ensembles from which adding more ensembles yields no significant differences
  • X axis: number of ensembles
  • Y axis: accuracy (averaged over datasets, complexity measures and cross-validation)
df_ranking <- data.frame(df_filtered$valores_combo)
colnames(df_ranking) <- 'valores_combo'
df_ranking$valor_split <- df_filtered$valor_split
df_ranking$valor_alpha <- df_filtered$valor_alpha
df_ranking$max_total <- rep(NA,dim(df_ranking)[1])
df_ranking$max_no_signif <- rep(NA,dim(df_ranking)[1])

# Grid layout (5 rows and 8 columns)
par(mfrow = c(5, 8), mar = c(2, 2, 2, 1))

max_acc_max_ensemble = 0

# Loops over alpha and split
for (alpha_value in c(2, 4, 6, 8, 10)) {
  for (split_value in c(1, 2, 4, 6, 8, 10, 12, 14)) {
    # Filter the data by alpha and split
    datos_alpha_s_1 <- datos_alpha_s %>% filter(alpha == alpha_value, split == split_value)
    datos_alpha_s_1 <- as.data.frame(datos_alpha_s_1)
    datos_alpha_s_1$n_cycle <- as.numeric(as.character(datos_alpha_s_1$n_cycle))
    datos_alpha_s_1$n_ensemble <- as.numeric(as.character(datos_alpha_s_1$n_ensemble))
    
    # Find the maximum
    idmax <- which.max(datos_alpha_s_1$accuracy_mean_dataset_mean)
    max_acc_ensemble <- datos_alpha_s_1[idmax, 'n_ensemble']
    # Store for the ranking
    df_ranking[(df_ranking$valor_alpha == alpha_value) & (df_ranking$valor_split == split_value),'max_total'] = max(datos_alpha_s_1$accuracy_mean_dataset_mean)

    # Count how many times the maximum accuracy is achieved with the maximum number of models
    max_acc_max_ensemble = max_acc_max_ensemble + sum(max_acc_ensemble== max(datos_alpha_s_1[,'n_ensemble']))
    max_signifi <- dif_no_sig_all[(dif_no_sig_all$valor_alpha == alpha_value) & (dif_no_sig_all$valor_split == split_value), 'max_num_models']
    # Store for the ranking
    df_ranking[(df_ranking$valor_alpha == alpha_value) & (df_ranking$valor_split == split_value),'max_no_signif'] = max(datos_alpha_s_1[datos_alpha_s_1$n_ensemble <= max_signifi,'accuracy_mean_dataset_mean'])
    
    # Plot
    plot(datos_alpha_s_1$n_ensemble, datos_alpha_s_1$accuracy_mean_dataset_mean, type = 'l', 
         xlab = 'n ensembles', ylab = 'accuracy mean', main = paste('alpha =', alpha_value, 'split =', split_value))
    
    # Add the corresponding points
    points(max_acc_ensemble, datos_alpha_s_1$accuracy_mean_dataset_mean[datos_alpha_s_1$n_ensemble == max_acc_ensemble], col = 'darkorange1', pch = 19)
    points(max_signifi, datos_alpha_s_1$accuracy_mean_dataset_mean[datos_alpha_s_1$n_ensemble == max_signifi], col = 'blue', pch = 19)
  }
}

# Reset the graphical parameters
par(mfrow = c(1, 1))

From the total of 40 combinations, in 17 of them the maximum accuracy is obtained at the maximum number of models tested.

The same plot, but with equal y-axis limits to enable fair comparisons.

# Grid layout (5 rows and 8 columns)
par(mfrow = c(5, 8), mar = c(2, 2, 2, 1))

# Loops over alpha and split
for (alpha_value in c(2, 4, 6, 8, 10)) {
  for (split_value in c(1, 2, 4, 6, 8, 10, 12, 14)) {
    # Filter the data by alpha and split
    datos_alpha_s_1 <- datos_alpha_s %>% filter(alpha == alpha_value, split == split_value)
    datos_alpha_s_1 <- as.data.frame(datos_alpha_s_1)
    datos_alpha_s_1$n_cycle <- as.numeric(as.character(datos_alpha_s_1$n_cycle))
    datos_alpha_s_1$n_ensemble <- as.numeric(as.character(datos_alpha_s_1$n_ensemble))
    
    # Find the maximum
    idmax <- which.max(datos_alpha_s_1$accuracy_mean_dataset_mean)
    max_acc_ensemble <- datos_alpha_s_1[idmax, 'n_ensemble']
    max_signifi <- dif_no_sig_all[(dif_no_sig_all$valor_alpha == alpha_value) & (dif_no_sig_all$valor_split == split_value), 'max_num_models']
    
    # Plot
    plot(datos_alpha_s_1$n_ensemble, datos_alpha_s_1$accuracy_mean_dataset_mean, type = 'l', 
         xlab = 'n ensembles', ylab = 'accuracy mean', main = paste('alpha =', alpha_value, 'split =', split_value),ylim=c(0.810,0.8153))
    
    # Add the corresponding points
    points(max_acc_ensemble, datos_alpha_s_1$accuracy_mean_dataset_mean[datos_alpha_s_1$n_ensemble == max_acc_ensemble], col = 'darkorange1', pch = 19)
    points(max_signifi, datos_alpha_s_1$accuracy_mean_dataset_mean[datos_alpha_s_1$n_ensemble == max_signifi], col = 'blue', pch = 19)
  }
}

# Reset the graphical parameters
par(mfrow = c(1, 1))

Low split values combined with high alpha values are not recommended (split = 1 with alpha = 6, 8, 10; split = 2 with alpha = 10): they obtain the lowest accuracy performances. –> omit split = 1 from the range of recommended parameters.

For the rest of the values, similar patterns are found.

We now plot all the lines in the same plot. The one with clearly lower accuracy values is split = 1 with alpha = 10. The rest of the combinations are visually very close and move within a narrow range, indicating that any of these parameter combinations is adequate.

datos_alpha_s$n_ensemble <- as.numeric(as.character(datos_alpha_s$n_ensemble))
datos_alpha_s$accuracy_mean_dataset_mean <- as.numeric(as.character(datos_alpha_s$accuracy_mean_dataset_mean))

p <- plot_ly()

for (alpha_value in c(2, 4, 6, 8, 10)) {
  for (split_value in c(1, 2, 4, 6, 8, 10, 12, 14)) {
    datos_alpha_s_1 <- datos_alpha_s %>% filter(alpha == alpha_value, split == split_value)
    p <- p %>%
      add_lines(x = datos_alpha_s_1$n_ensemble, 
                y = datos_alpha_s_1$accuracy_mean_dataset_mean, 
                name = paste("alpha =", alpha_value, "split =", split_value), 
                line = list(width = 2),
                hovertemplate = paste('Alpha: ', alpha_value, 
                                    ' Split:', split_value,
                                    '<br>N ensemble:', datos_alpha_s_1$n_ensemble,
                                    '<br>Accuracy:', round(datos_alpha_s_1$accuracy_mean_dataset_mean,4),
                                    '<extra></extra>'))
  }
}

p <- p %>%
  layout(title = 'All combinations of alpha and split',
         xaxis = list(title = 'n ensembles'),
         yaxis = list(title = 'accuracy mean'),
         legend = list(title = list(text = 'Legend')))

p

Let’s now obtain a ranking according to the maximum accuracy (orange point) and another according to the maximum accuracy obtained before the significant differences disappear (blue point).

df_ranking_order <- df_ranking  %>% arrange(desc(max_total))
df_ranking_order_sig <- df_ranking  %>% arrange(desc(max_no_signif))
df_ranking$max_total_order = rank(-df_ranking$max_total)
df_ranking$max_no_signif_order = rank(-df_ranking$max_no_signif)
knitr::kable(df_ranking %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha10-split4 4 10 0.8154611 0.8143530 1 4
alpha8-split8 8 8 0.8153885 0.8145666 2 2
alpha8-split14 14 8 0.8153753 0.8142637 3 5
alpha6-split12 12 6 0.8152693 0.8147517 4 1
alpha4-split6 6 4 0.8152230 0.8140174 5 11
alpha10-split10 10 10 0.8152139 0.8140209 6 9
alpha6-split2 2 6 0.8151711 0.8139021 7 16
alpha6-split4 4 6 0.8151421 0.8137556 8 20
alpha2-split2 2 2 0.8151253 0.8126970 9 36
alpha10-split14 14 10 0.8151197 0.8138357 10 18
alpha4-split2 2 4 0.8150566 0.8132019 11 33
alpha10-split8 8 10 0.8150333 0.8138294 12 19
alpha10-split12 12 10 0.8150314 0.8141212 13 8
alpha4-split8 8 4 0.8150247 0.8136885 14 22
alpha6-split8 8 6 0.8149867 0.8139120 15 15
alpha8-split4 4 8 0.8149808 0.8136210 16 26
alpha6-split10 10 6 0.8149564 0.8138805 17 17
alpha2-split6 6 2 0.8149475 0.8140195 18 10
alpha4-split12 12 4 0.8149316 0.8133171 19 30
alpha8-split2 2 8 0.8149307 0.8126106 20 37
alpha10-split6 6 10 0.8149225 0.8139898 21 13
alpha2-split12 12 2 0.8148890 0.8136253 22 25
alpha4-split4 4 4 0.8148650 0.8135346 23 28
alpha2-split1 1 2 0.8148647 0.8135486 24 27
alpha8-split10 10 8 0.8148367 0.8142399 25 6
alpha2-split4 4 2 0.8148173 0.8136494 26 23
alpha6-split14 14 6 0.8147976 0.8145524 27 3
alpha2-split14 14 2 0.8147764 0.8133553 28 29
alpha4-split14 14 4 0.8147503 0.8140101 29 12
alpha6-split6 6 6 0.8146940 0.8136467 30 24
alpha8-split6 6 8 0.8146725 0.8141604 31 7
alpha4-split1 1 4 0.8146421 0.8128133 32 35
alpha2-split10 10 2 0.8145970 0.8139740 33 14
alpha10-split2 2 10 0.8145717 0.8132155 34 32
alpha8-split12 12 8 0.8145423 0.8137486 35 21
alpha4-split10 10 4 0.8144620 0.8132594 36 31
alpha2-split8 8 2 0.8144093 0.8131612 37 34
alpha8-split1 1 8 0.8140869 0.8122488 38 38
alpha6-split1 1 6 0.8140247 0.8118904 39 39
alpha10-split1 1 10 0.8132346 0.8108149 40 40
cor.test(df_ranking$max_total_order, df_ranking$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking$max_total_order and df_ranking$max_no_signif_order
S = 4840, p-value = 0.000334
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.5459662 
cor.test(df_ranking$max_total_order, df_ranking$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking$max_total_order and df_ranking$max_no_signif_order
t = 4.0171, df = 38, p-value = 0.0002684
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.2825032 0.7328390
sample estimates:
      cor 
0.5459662 

The two rankings are not highly correlated, but they generally agree on what obtains the worst results (high alpha with low split). The lack of agreement on which combination is best can be interpreted as meaning that several different parameter combinations work well.
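
As a side note on reading these rho values: a moderate correlation is exactly what agreement at the extremes combined with disagreement in the middle produces. A toy sketch with made-up rankings (hypothetical values, not our data):

```r
# Two hypothetical rankings of 8 configurations that agree on the
# worst positions (7 and 8) but shuffle the best ones.
r1 <- c(1, 2, 3, 4, 5, 6, 7, 8)
r2 <- c(4, 1, 5, 2, 6, 3, 7, 8)
cor(r1, r2, method = "spearman")  # ~0.67: moderate, despite full agreement at the bottom
```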

For now, it is not clear which is better: many short cycles or a few long cycles. It seems that an intermediate point between those two extremes is adequate.

Analysis per Complexity Measure

Now, we repeat the same analysis but for each complexity measure.

#setwd("/home/carmen/PycharmProjects/EnsemblesComplexity/Results_general_algorithm_cycles")
# Data disaggregated per complexity measures
datos_CM <- read.csv('df_summary_CM.csv') 
str(datos_CM)
'data.frame':   29880 obs. of  8 variables:
 $ weights             : chr  "CLD" "CLD" "CLD" "CLD" ...
 $ n_cycle             : int  1 1 1 1 1 1 1 1 1 1 ...
 $ n_ensemble          : int  2 2 2 2 2 2 2 2 2 2 ...
 $ alpha               : int  10 12 14 16 18 2 20 4 6 8 ...
 $ split               : int  1 1 1 1 1 1 1 1 1 1 ...
 $ accuracy_mean_mean  : num  0.772 0.767 0.771 0.767 0.766 ...
 $ accuracy_mean_median: num  0.731 0.746 0.748 0.75 0.734 ...
 $ accuracy_mean_std   : num  0.135 0.141 0.135 0.139 0.139 ...
# Since Python indexing starts at 0, we add 1 to n_ensemble
datos_CM$n_ensemble <- datos_CM$n_ensemble + 1
# Convert id and time into factor variables
datos_CM <- datos_CM %>%
  convert_as_factor(weights, n_cycle,n_ensemble)
datos_CM_filtro <- datos_CM %>% filter(alpha<12, split <16) 
# No further aggregation is needed because this dataset is already aggregated at source

For the information about significant differences, we use the values obtained overall. That is, we do not repeat the multiple-comparisons analyses for each complexity measure.

plot_2max_grid_with_ranking <- function(CM,df_filtered,dif_no_sig_all,datos_CM){
  df_ranking_CM <- data.frame(df_filtered$valores_combo)
colnames(df_ranking_CM) <- 'valores_combo'
df_ranking_CM$valor_split <- df_filtered$valor_split
df_ranking_CM$valor_alpha <- df_filtered$valor_alpha
df_ranking_CM$max_total <- rep(NA,dim(df_ranking_CM)[1])
df_ranking_CM$max_no_signif <- rep(NA,dim(df_ranking_CM)[1])

# Grid layout (5 rows and 8 columns)
par(mfrow = c(5, 8), mar = c(2, 2, 2, 1))

max_acc_max_ensemble = 0

# Loops over alpha and split
for (alpha_value in c(2, 4, 6, 8, 10)) {
  for (split_value in c(1, 2, 4, 6, 8, 10, 12, 14)) {
    # Filter the data by alpha and split
    datos_CM_case <- datos_CM %>% filter(weights == CM,
                                         alpha == alpha_value, split == split_value)
    datos_CM_case <- as.data.frame(datos_CM_case)
    datos_CM_case$n_cycle <- as.numeric(as.character(datos_CM_case$n_cycle))
    datos_CM_case$n_ensemble <- as.numeric(as.character(datos_CM_case$n_ensemble))
    
    # Find the maximum
    idmax <- which.max(datos_CM_case$accuracy_mean_mean)
    max_acc_ensemble <- datos_CM_case[idmax, 'n_ensemble']
    # Store for the ranking
    df_ranking_CM[(df_ranking_CM$valor_alpha == alpha_value) & (df_ranking_CM$valor_split == split_value),'max_total'] = max(datos_CM_case$accuracy_mean_mean)
    
    # Count how many times the maximum accuracy is achieved with the maximum number of models
    max_acc_max_ensemble = max_acc_max_ensemble + sum(max_acc_ensemble== max(datos_CM_case[,'n_ensemble']))
    max_signifi <- dif_no_sig_all[(dif_no_sig_all$valor_alpha == alpha_value) & (dif_no_sig_all$valor_split == split_value), 'max_num_models']
    # Store for the ranking
    df_ranking_CM[(df_ranking_CM$valor_alpha == alpha_value) & (df_ranking_CM$valor_split == split_value),'max_no_signif'] = max(datos_CM_case[datos_CM_case$n_ensemble <= max_signifi,'accuracy_mean_mean'])
    
    # Plot
    plot(datos_CM_case$n_ensemble, datos_CM_case$accuracy_mean_mean, type = 'l', 
         xlab = 'n ensembles', ylab = 'accuracy mean', main = paste('alpha =', alpha_value, 'split =', split_value),ylim=c(0.805,0.818))
    
    # Add the corresponding points
    points(max_acc_ensemble, datos_CM_case$accuracy_mean_mean[datos_CM_case$n_ensemble == max_acc_ensemble]+0.0003, col = 'darkorange1', pch = 19)
    points(max_signifi, datos_CM_case$accuracy_mean_mean[datos_CM_case$n_ensemble == max_signifi], col = 'blue', pch = 19)
  }
}

# Reset the graphical parameters
par(mfrow = c(1, 1))
return(list(df_ranking_CM = df_ranking_CM,max_acc_max_ensemble = max_acc_max_ensemble))
}
plot_all_combinations <- function(CM,datos_CM_filtro){
  datos_CM_filtro$n_ensemble <- as.numeric(as.character(datos_CM_filtro$n_ensemble))
datos_CM_filtro$accuracy_mean_mean <- as.numeric(as.character(datos_CM_filtro$accuracy_mean_mean))

p <- plot_ly()

for (alpha_value in c(2, 4, 6, 8, 10)) {
  for (split_value in c(1, 2, 4, 6, 8, 10, 12, 14)) {
    datos_CM_case <- datos_CM_filtro %>% filter(weights == CM,
      alpha == alpha_value, split == split_value)
    p <- p %>%
      add_lines(x = datos_CM_case$n_ensemble, 
                y = datos_CM_case$accuracy_mean_mean, 
                name = paste("alpha =", alpha_value, "split =", split_value), 
                line = list(width = 2),
                hovertemplate = paste('Alpha: ', alpha_value, 
                                    ' Split:', split_value,
                                    '<br>N ensemble:', datos_CM_case$n_ensemble,
                                    '<br>Accuracy:', round(datos_CM_case$accuracy_mean_mean,4),
                                    '<extra></extra>'))
  }
}

p <- p %>%
  layout(title = paste(CM,': All combinations of alpha and split'),
         xaxis = list(title = 'n ensembles'),
         yaxis = list(title = 'accuracy mean'),
         legend = list(title = list(text = 'Legend')))

p
}

CLD

CM = 'CLD'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

From the total of 40 combinations, in 4 of them the maximum accuracy is obtained at the maximum number of models tested.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 6522, p-value = 0.01384
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.3881801 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 2.5965, df = 38, p-value = 0.01332
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.08721722 0.62420902
sample estimates:
      cor 
0.3881801 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha2-split2 2 2 0.8186673 0.8140829 1 29
alpha6-split8 8 6 0.8181947 0.8148745 2 19
alpha4-split2 2 4 0.8179033 0.8138234 3 32
alpha6-split4 4 6 0.8177918 0.8165264 4 3
alpha2-split1 1 2 0.8175951 0.8154003 5 13
alpha8-split2 2 8 0.8173707 0.8125193 6 35
alpha8-split8 8 8 0.8171180 0.8156546 7 11
alpha4-split10 10 4 0.8170585 0.8156799 8 10
alpha8-split14 14 8 0.8170187 0.8161492 9 5
alpha6-split12 12 6 0.8169570 0.8169570 10 1
alpha2-split10 10 2 0.8169183 0.8169183 11 2
alpha6-split2 2 6 0.8169084 0.8146636 12 21
alpha10-split8 8 10 0.8168593 0.8144187 13 25
alpha2-split12 12 2 0.8167607 0.8154363 14 12
alpha6-split10 10 6 0.8167205 0.8157030 15 9
alpha4-split6 6 4 0.8167165 0.8160603 16 6
alpha6-split6 6 6 0.8166806 0.8153664 17 14
alpha8-split12 12 8 0.8166451 0.8140446 18 30
alpha4-split1 1 4 0.8165773 0.8139498 19 31
alpha2-split14 14 2 0.8165446 0.8142291 20 28
alpha4-split4 4 4 0.8164919 0.8158095 21 8
alpha2-split8 8 2 0.8164285 0.8144007 22 26
alpha8-split10 10 8 0.8164271 0.8150485 23 17
alpha6-split14 14 6 0.8163311 0.8163311 24 4
alpha8-split4 4 8 0.8163072 0.8121492 25 38
alpha2-split6 6 2 0.8162466 0.8146132 26 23
alpha4-split12 12 4 0.8161290 0.8151108 27 15
alpha6-split1 1 6 0.8160392 0.8117883 28 39
alpha10-split14 14 10 0.8160372 0.8144986 29 24
alpha10-split2 2 10 0.8160257 0.8121663 30 37
alpha10-split12 12 10 0.8159937 0.8159937 31 7
alpha10-split10 10 10 0.8159482 0.8149080 32 18
alpha2-split4 4 2 0.8157506 0.8146230 33 22
alpha10-split6 6 10 0.8155268 0.8138097 34 33
alpha10-split4 4 10 0.8154671 0.8150541 35 16
alpha4-split14 14 4 0.8154587 0.8146836 36 20
alpha4-split8 8 4 0.8151189 0.8142452 37 27
alpha8-split1 1 8 0.8147244 0.8124780 38 36
alpha8-split6 6 8 0.8144569 0.8132987 39 34
alpha10-split1 1 10 0.8130327 0.8101046 40 40
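The ranking and correlation steps above are repeated verbatim for every complexity measure. They could be factored into a small helper; the sketch below is not code from this notebook and assumes a data frame with `max_total` and `max_no_signif` columns, as returned by `plot_2max_grid_with_ranking`:

```r
# Hypothetical helper (not part of this notebook): adds the two ranking
# columns and returns both correlation tests between the orderings
rank_and_correlate <- function(df) {
  df$max_total_order     <- rank(-df$max_total)
  df$max_no_signif_order <- rank(-df$max_no_signif)
  list(df = df,
       spearman = cor.test(df$max_total_order, df$max_no_signif_order,
                           method = "spearman"),
       pearson  = cor.test(df$max_total_order, df$max_no_signif_order,
                           method = "pearson"))
}
```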

DCP

CM = 'DCP'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

Out of the 40 combinations, the maximum accuracy is obtained at the maximum number of models tested in 10 of them.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 2974, p-value = 5.893e-07
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.7210131 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 6.4143, df = 38, p-value = 1.541e-07
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.5281210 0.8431492
sample estimates:
      cor 
0.7210131 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha6-split12 12 6 0.8159933 0.8149096 1 4
alpha6-split10 10 6 0.8157778 0.8132686 2 20
alpha10-split8 8 10 0.8154192 0.8143453 3 10
alpha10-split12 12 10 0.8154055 0.8144421 4 7
alpha8-split10 10 8 0.8153513 0.8144251 5 8
alpha4-split6 6 4 0.8152928 0.8137071 6 15
alpha4-split12 12 4 0.8152776 0.8129877 7 24
alpha4-split14 14 4 0.8152697 0.8152697 8 1
alpha2-split6 6 2 0.8152138 0.8149806 9 3
alpha8-split12 12 8 0.8151934 0.8151934 10 2
alpha2-split10 10 2 0.8149216 0.8128673 11 25
alpha10-split14 14 10 0.8149094 0.8144822 12 5
alpha8-split8 8 8 0.8148922 0.8138944 13 14
alpha8-split14 14 8 0.8147763 0.8121248 14 30
alpha2-split2 2 2 0.8147358 0.8124658 15 28
alpha10-split10 10 10 0.8147164 0.8133395 16 19
alpha2-split12 12 2 0.8146648 0.8144673 17 6
alpha4-split4 4 4 0.8146444 0.8130253 18 23
alpha10-split6 6 10 0.8146360 0.8144216 19 9
alpha2-split4 4 2 0.8145797 0.8142057 20 12
alpha6-split6 6 6 0.8144653 0.8131380 21 21
alpha6-split4 4 6 0.8144523 0.8120375 22 31
alpha10-split4 4 10 0.8144110 0.8135214 23 18
alpha6-split14 14 6 0.8143680 0.8140293 24 13
alpha2-split14 14 2 0.8143539 0.8119567 25 33
alpha8-split6 6 8 0.8143278 0.8143278 26 11
alpha4-split2 2 4 0.8143162 0.8127342 27 26
alpha8-split4 4 8 0.8141359 0.8136196 28 16
alpha4-split8 8 4 0.8140390 0.8136045 29 17
alpha4-split10 10 4 0.8140284 0.8126961 30 27
alpha10-split2 2 10 0.8139477 0.8111478 31 37
alpha6-split8 8 6 0.8138093 0.8122963 32 29
alpha2-split8 8 2 0.8137995 0.8119626 33 32
alpha8-split2 2 8 0.8133363 0.8105069 34 38
alpha6-split2 2 6 0.8130792 0.8130792 35 22
alpha2-split1 1 2 0.8124612 0.8115964 36 35
alpha8-split1 1 8 0.8124140 0.8118039 37 34
alpha4-split1 1 4 0.8122128 0.8104364 38 39
alpha6-split1 1 6 0.8118536 0.8114158 39 36
alpha10-split1 1 10 0.8110632 0.8096346 40 40

F1

CM = 'F1'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

Out of the 40 combinations, the maximum accuracy is obtained at the maximum number of models tested in 11 of them.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 7332, p-value = 0.05031
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.3121951 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 2.0258, df = 38, p-value = 0.04985
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.0007596503 0.5684242470
sample estimates:
      cor 
0.3121951 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha2-split8 8 2 0.8162181 0.8134296 1 18
alpha2-split12 12 2 0.8161000 0.8124841 2 31
alpha10-split14 14 10 0.8158790 0.8120800 3 34
alpha4-split8 8 4 0.8158007 0.8130042 4 24
alpha2-split4 4 2 0.8157836 0.8144604 5 5
alpha8-split2 2 8 0.8156491 0.8125066 6 30
alpha4-split12 12 4 0.8155108 0.8128614 7 25
alpha2-split2 2 2 0.8154293 0.8141931 8 10
alpha4-split6 6 4 0.8153974 0.8142798 9 8
alpha6-split8 8 6 0.8153418 0.8145160 10 4
alpha6-split2 2 6 0.8153171 0.8148235 11 1
alpha2-split6 6 2 0.8153134 0.8143499 12 7
alpha10-split10 10 10 0.8152753 0.8132983 13 20
alpha10-split4 4 10 0.8152012 0.8141789 14 11
alpha8-split14 14 8 0.8151775 0.8135679 15 16
alpha2-split10 10 2 0.8151240 0.8131812 16 22
alpha4-split4 4 4 0.8151099 0.8136634 17 13
alpha2-split14 14 2 0.8150920 0.8120937 18 33
alpha10-split6 6 10 0.8150444 0.8145185 19 3
alpha8-split8 8 8 0.8149686 0.8147738 20 2
alpha4-split1 1 4 0.8149446 0.8108564 21 38
alpha10-split8 8 10 0.8149079 0.8119178 22 35
alpha10-split2 2 10 0.8146277 0.8143804 23 6
alpha8-split6 6 8 0.8145822 0.8131939 24 21
alpha6-split6 6 6 0.8144703 0.8128144 25 27
alpha6-split10 10 6 0.8143065 0.8135902 26 15
alpha6-split4 4 6 0.8142832 0.8133385 27 19
alpha4-split2 2 4 0.8142640 0.8113237 28 37
alpha4-split14 14 4 0.8142045 0.8142045 29 9
alpha8-split10 10 8 0.8141585 0.8134429 30 17
alpha8-split4 4 8 0.8141324 0.8125376 31 29
alpha10-split12 12 10 0.8140393 0.8136306 32 14
alpha8-split12 12 8 0.8139673 0.8131647 33 23
alpha6-split1 1 6 0.8139621 0.8102051 34 40
alpha6-split14 14 6 0.8139259 0.8127788 35 28
alpha6-split12 12 6 0.8139157 0.8137382 36 12
alpha2-split1 1 2 0.8138510 0.8128321 37 26
alpha4-split10 10 4 0.8137747 0.8118133 38 36
alpha8-split1 1 8 0.8136985 0.8124232 39 32
alpha10-split1 1 10 0.8135267 0.8103752 40 39

Hostility

CM = 'Hostility'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

Out of the 40 combinations, the maximum accuracy is obtained at the maximum number of models tested in 4 of them.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 5866, p-value = 0.003916
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.4497186 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 3.1038, df = 38, p-value = 0.003598
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.1607255 0.6676902
sample estimates:
      cor 
0.4497186 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha2-split1 1 2 0.8183264 0.8152695 1 17
alpha10-split10 10 10 0.8179462 0.8167880 2 1
alpha6-split14 14 6 0.8177318 0.8163649 3 4
alpha6-split2 2 6 0.8174904 0.8155085 4 14
alpha2-split6 6 2 0.8173635 0.8148000 5 22
alpha4-split2 2 4 0.8173371 0.8158211 6 8
alpha8-split14 14 8 0.8172209 0.8146994 7 25
alpha10-split4 4 10 0.8171738 0.8130224 8 39
alpha10-split14 14 10 0.8170801 0.8160770 9 5
alpha8-split8 8 8 0.8169789 0.8150406 10 20
alpha2-split12 12 2 0.8169637 0.8143879 11 28
alpha4-split6 6 4 0.8169626 0.8153869 12 15
alpha8-split6 6 8 0.8169489 0.8163872 13 3
alpha10-split2 2 10 0.8169456 0.8147548 14 24
alpha6-split4 4 6 0.8169361 0.8144258 15 26
alpha10-split6 6 10 0.8168733 0.8147879 16 23
alpha6-split12 12 6 0.8167887 0.8164475 17 2
alpha10-split8 8 10 0.8167503 0.8150250 18 21
alpha10-split12 12 10 0.8167451 0.8153062 19 16
alpha8-split2 2 8 0.8166749 0.8142731 20 30
alpha4-split1 1 4 0.8166698 0.8157692 21 10
alpha8-split10 10 8 0.8165567 0.8157147 22 12
alpha8-split4 4 8 0.8165258 0.8151661 23 18
alpha4-split8 8 4 0.8163974 0.8159673 24 7
alpha2-split10 10 2 0.8162803 0.8160096 25 6
alpha2-split2 2 2 0.8159364 0.8135709 26 34
alpha8-split1 1 8 0.8158978 0.8134093 27 38
alpha6-split6 6 6 0.8158764 0.8155331 28 13
alpha2-split14 14 2 0.8158485 0.8134844 29 37
alpha8-split12 12 8 0.8158396 0.8158004 30 9
alpha4-split14 14 4 0.8158305 0.8144073 31 27
alpha6-split10 10 6 0.8158282 0.8141512 32 31
alpha4-split10 10 4 0.8158173 0.8135486 33 35
alpha4-split4 4 4 0.8157995 0.8143759 34 29
alpha2-split4 4 2 0.8157732 0.8151173 35 19
alpha6-split8 8 6 0.8157587 0.8157587 36 11
alpha4-split12 12 4 0.8151718 0.8141331 37 32
alpha6-split1 1 6 0.8150827 0.8134964 38 36
alpha2-split8 8 2 0.8150754 0.8136298 39 33
alpha10-split1 1 10 0.8147001 0.8113592 40 40

kDN

CM = 'kDN'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

Out of the 40 combinations, the maximum accuracy is obtained at the maximum number of models tested in 9 of them.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 5634, p-value = 0.002375
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.4714822 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 3.2957, df = 38, p-value = 0.002133
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.1875131 0.6827197
sample estimates:
      cor 
0.4714822 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha10-split2 2 10 0.8166958 0.8149255 1 7
alpha8-split6 6 8 0.8164597 0.8157555 2 2
alpha8-split8 8 8 0.8164298 0.8157762 3 1
alpha4-split2 2 4 0.8159599 0.8127375 4 34
alpha4-split8 8 4 0.8159394 0.8143590 5 10
alpha10-split8 8 10 0.8158935 0.8142164 6 13
alpha6-split1 1 6 0.8158173 0.8129876 7 29
alpha2-split4 4 2 0.8158129 0.8134779 8 22
alpha10-split4 4 10 0.8158058 0.8139595 9 16
alpha8-split10 10 8 0.8157530 0.8152683 10 4
alpha6-split8 8 6 0.8156493 0.8144016 11 9
alpha10-split12 12 10 0.8155851 0.8152713 12 3
alpha8-split2 2 8 0.8155823 0.8141484 13 14
alpha8-split4 4 8 0.8154415 0.8143571 14 11
alpha4-split6 6 4 0.8154074 0.8122357 15 38
alpha8-split14 14 8 0.8153551 0.8128341 16 31
alpha6-split2 2 6 0.8153028 0.8137750 17 20
alpha6-split12 12 6 0.8152580 0.8141094 18 15
alpha6-split10 10 6 0.8152204 0.8137833 19 19
alpha10-split14 14 10 0.8151354 0.8150855 20 5
alpha10-split6 6 10 0.8151222 0.8144494 21 8
alpha2-split1 1 2 0.8149564 0.8137297 22 21
alpha8-split12 12 8 0.8149534 0.8128829 23 30
alpha2-split10 10 2 0.8149413 0.8149413 24 6
alpha10-split10 10 10 0.8148877 0.8124655 25 36
alpha4-split1 1 4 0.8148139 0.8119671 26 39
alpha6-split6 6 6 0.8147989 0.8131325 27 25
alpha4-split4 4 4 0.8147625 0.8124454 28 37
alpha2-split14 14 2 0.8147223 0.8132137 29 24
alpha4-split10 10 4 0.8147028 0.8130496 30 27
alpha6-split4 4 6 0.8146168 0.8127233 31 35
alpha2-split2 2 2 0.8144122 0.8115326 32 40
alpha8-split1 1 8 0.8143959 0.8127543 33 33
alpha4-split14 14 4 0.8143388 0.8138410 34 18
alpha6-split14 14 6 0.8143282 0.8143282 35 12
alpha2-split8 8 2 0.8142856 0.8130378 36 28
alpha2-split6 6 2 0.8142566 0.8132373 37 23
alpha4-split12 12 4 0.8142358 0.8138505 38 17
alpha10-split1 1 10 0.8140360 0.8127738 39 32
alpha2-split12 12 2 0.8136703 0.8131120 40 26

LSC

CM = 'LSC'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

Out of the 40 combinations, the maximum accuracy is obtained at the maximum number of models tested in 3 of them.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 4606, p-value = 0.0001722
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.5679174 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 4.2534, df = 38, p-value = 0.0001322
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.3115195 0.7472327
sample estimates:
      cor 
0.5679174 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha6-split2 2 6 0.8174790 0.8155044 1 2
alpha8-split1 1 8 0.8172642 0.8135382 2 25
alpha4-split1 1 4 0.8168509 0.8140675 3 15
alpha10-split10 10 10 0.8167587 0.8151069 4 4
alpha6-split1 1 6 0.8165534 0.8121199 5 38
alpha10-split2 2 10 0.8164410 0.8147029 6 8
alpha10-split4 4 10 0.8163950 0.8146684 7 11
alpha6-split12 12 6 0.8163934 0.8158625 8 1
alpha10-split6 6 10 0.8163215 0.8154674 9 3
alpha6-split4 4 6 0.8161502 0.8143284 10 13
alpha8-split6 6 8 0.8161034 0.8149464 11 6
alpha4-split12 12 4 0.8158882 0.8128055 12 33
alpha2-split12 12 2 0.8158775 0.8146965 13 9
alpha10-split12 12 10 0.8157703 0.8136046 14 22
alpha6-split10 10 6 0.8156055 0.8146772 15 10
alpha10-split8 8 10 0.8155976 0.8149258 16 7
alpha8-split2 2 8 0.8155833 0.8133340 17 29
alpha6-split8 8 6 0.8155551 0.8133869 18 28
alpha2-split2 2 2 0.8155022 0.8135514 19 24
alpha10-split1 1 10 0.8154531 0.8127520 20 34
alpha8-split8 8 8 0.8153889 0.8150077 21 5
alpha2-split1 1 2 0.8151730 0.8137822 22 17
alpha2-split4 4 2 0.8151302 0.8133091 23 30
alpha4-split8 8 4 0.8148961 0.8137071 24 20
alpha8-split14 14 8 0.8148716 0.8137775 25 18
alpha8-split12 12 8 0.8148426 0.8140306 26 16
alpha4-split4 4 4 0.8147452 0.8142522 27 14
alpha4-split6 6 4 0.8146811 0.8129684 28 32
alpha4-split2 2 4 0.8146130 0.8145721 29 12
alpha8-split4 4 8 0.8145890 0.8136260 30 21
alpha2-split14 14 2 0.8145771 0.8137322 31 19
alpha4-split10 10 4 0.8145205 0.8135574 32 23
alpha2-split8 8 2 0.8143284 0.8117274 33 40
alpha8-split10 10 8 0.8142789 0.8134682 34 27
alpha10-split14 14 10 0.8140589 0.8126760 35 36
alpha4-split14 14 4 0.8140410 0.8126298 36 37
alpha6-split6 6 6 0.8137130 0.8119671 37 39
alpha2-split6 6 2 0.8135893 0.8134920 38 26
alpha6-split14 14 6 0.8135450 0.8132985 39 31
alpha2-split10 10 2 0.8134003 0.8126903 40 35

N1

CM = 'N1'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

Out of the 40 combinations, the maximum accuracy is obtained at the maximum number of models tested in 7 of them.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 4672, p-value = 0.0002085
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.5617261 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 4.1854, df = 38, p-value = 0.0001623
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.3032867 0.7431899
sample estimates:
      cor 
0.5617261 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha8-split14 14 8 0.8160921 0.8147946 1 3
alpha6-split4 4 6 0.8159718 0.8152506 2 1
alpha4-split6 6 4 0.8155566 0.8142342 3 8
alpha6-split6 6 6 0.8154564 0.8139963 4 9
alpha10-split1 1 10 0.8154013 0.8108402 5 39
alpha4-split1 1 4 0.8153194 0.8144388 6 4
alpha4-split14 14 4 0.8152853 0.8143108 7 7
alpha8-split1 1 8 0.8151730 0.8137670 8 14
alpha2-split10 10 2 0.8148930 0.8143960 9 5
alpha6-split14 14 6 0.8148918 0.8148918 10 2
alpha6-split2 2 6 0.8148320 0.8127019 11 28
alpha8-split6 6 8 0.8148028 0.8137782 12 13
alpha8-split2 2 8 0.8148012 0.8131357 13 19
alpha10-split2 2 10 0.8147161 0.8138692 14 12
alpha10-split8 8 10 0.8146805 0.8127239 15 27
alpha6-split1 1 6 0.8145810 0.8136500 16 15
alpha8-split4 4 8 0.8145453 0.8138717 17 11
alpha2-split2 2 2 0.8145295 0.8128737 18 25
alpha4-split2 2 4 0.8144803 0.8120003 19 36
alpha2-split1 1 2 0.8143939 0.8123416 20 34
alpha4-split4 4 4 0.8143838 0.8126720 21 29
alpha4-split8 8 4 0.8143782 0.8131206 22 20
alpha4-split10 10 4 0.8143510 0.8122434 23 35
alpha10-split4 4 10 0.8143358 0.8143203 24 6
alpha6-split8 8 6 0.8142924 0.8130314 25 23
alpha2-split6 6 2 0.8142406 0.8125992 26 32
alpha10-split6 6 10 0.8141273 0.8129221 27 24
alpha8-split8 8 8 0.8141097 0.8135144 28 16
alpha6-split10 10 6 0.8139585 0.8131115 29 21
alpha10-split10 10 10 0.8139383 0.8125715 30 33
alpha10-split14 14 10 0.8139283 0.8105000 31 40
alpha6-split12 12 6 0.8138974 0.8138974 32 10
alpha8-split12 12 8 0.8138489 0.8135008 33 17
alpha2-split14 14 2 0.8137152 0.8130780 34 22
alpha2-split8 8 2 0.8136973 0.8132054 35 18
alpha4-split12 12 4 0.8136834 0.8126641 36 30
alpha8-split10 10 8 0.8136187 0.8127684 37 26
alpha2-split4 4 2 0.8135237 0.8118722 38 37
alpha2-split12 12 2 0.8133885 0.8117790 39 38
alpha10-split12 12 10 0.8132016 0.8126359 40 31

N2

CM = 'N2'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

Out of the 40 combinations, the maximum accuracy is obtained at the maximum number of models tested in 5 of them.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 4378, p-value = 8.647e-05
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.5893058 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 4.4964, df = 38, p-value = 6.311e-05
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.3402593 0.7610973
sample estimates:
      cor 
0.5893058 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha10-split4 4 10 0.8176633 0.8175082 1 1
alpha8-split2 2 8 0.8174917 0.8135565 2 35
alpha6-split6 6 6 0.8174469 0.8155257 3 14
alpha4-split14 14 4 0.8173758 0.8161334 4 8
alpha10-split14 14 10 0.8173367 0.8163662 5 4
alpha4-split2 2 4 0.8171336 0.8155473 6 13
alpha6-split14 14 6 0.8171040 0.8160103 7 9
alpha2-split14 14 2 0.8169859 0.8160073 8 10
alpha2-split1 1 2 0.8169028 0.8169028 9 2
alpha10-split10 10 10 0.8168536 0.8162266 10 7
alpha8-split8 8 8 0.8168236 0.8153978 11 16
alpha8-split14 14 8 0.8167775 0.8167775 12 3
alpha2-split4 4 2 0.8167769 0.8158353 13 11
alpha10-split12 12 10 0.8167673 0.8134267 14 37
alpha6-split8 8 6 0.8166710 0.8156282 15 12
alpha6-split1 1 6 0.8166676 0.8134639 16 36
alpha6-split2 2 6 0.8166484 0.8138925 17 30
alpha4-split6 6 4 0.8164504 0.8153481 18 17
alpha8-split1 1 8 0.8163691 0.8135669 19 34
alpha8-split12 12 8 0.8162953 0.8162953 20 5
alpha10-split2 2 10 0.8162863 0.8150421 21 19
alpha6-split12 12 6 0.8162414 0.8162414 22 6
alpha4-split1 1 4 0.8162046 0.8147208 23 23
alpha8-split4 4 8 0.8161562 0.8145362 24 24
alpha8-split10 10 8 0.8160215 0.8151208 25 18
alpha2-split6 6 2 0.8159990 0.8154367 26 15
alpha4-split4 4 4 0.8159303 0.8140791 27 28
alpha6-split4 4 6 0.8159229 0.8141272 28 27
alpha4-split12 12 4 0.8158993 0.8143760 29 25
alpha10-split1 1 10 0.8158656 0.8126933 30 40
alpha6-split10 10 6 0.8158297 0.8142168 31 26
alpha2-split8 8 2 0.8157571 0.8138533 32 31
alpha4-split10 10 4 0.8157254 0.8133341 33 38
alpha10-split8 8 10 0.8156095 0.8147222 34 22
alpha4-split8 8 4 0.8154228 0.8148350 35 21
alpha10-split6 6 10 0.8154074 0.8135735 36 33
alpha8-split6 6 8 0.8153698 0.8149163 37 20
alpha2-split10 10 2 0.8153456 0.8140350 38 29
alpha2-split12 12 2 0.8153238 0.8138204 39 32
alpha2-split2 2 2 0.8153187 0.8130423 40 39

TD_U

CM = 'TD_U'
res = plot_2max_grid_with_ranking(CM,df_filtered,dif_no_sig_all,datos_CM)

df_ranking_CM = res$df_ranking_CM
max_acc_max_ensemble = res$max_acc_max_ensemble

Out of the 40 combinations, the maximum accuracy is obtained at the maximum number of models tested in 6 of them.

plot_all_combinations(CM,datos_CM_filtro)
df_ranking_CM$max_total_order = rank(-df_ranking_CM$max_total)
df_ranking_CM$max_no_signif_order = rank(-df_ranking_CM$max_no_signif)

cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('spearman'))

    Spearman's rank correlation rho

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
S = 3338, p-value = 2.179e-06
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.6868668 
cor.test(df_ranking_CM$max_total_order, df_ranking_CM$max_no_signif_order, method=c('pearson'))

    Pearson's product-moment correlation

data:  df_ranking_CM$max_total_order and df_ranking_CM$max_no_signif_order
t = 5.8259, df = 38, p-value = 9.869e-07
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.477533 0.822409
sample estimates:
      cor 
0.6868668 
knitr::kable(df_ranking_CM %>% arrange(max_total_order))
valores_combo valor_split valor_alpha max_total max_no_signif max_total_order max_no_signif_order
alpha2-split1 1 2 0.8157151 0.8136244 1 15
alpha10-split4 4 10 0.8154632 0.8142351 2 8
alpha10-split14 14 10 0.8154406 0.8148932 3 1
alpha2-split10 10 2 0.8150918 0.8137289 4 12
alpha2-split2 2 2 0.8150718 0.8127472 5 29
alpha2-split14 14 2 0.8150256 0.8136687 6 13
alpha4-split8 8 4 0.8149023 0.8130217 7 25
alpha4-split12 12 4 0.8148715 0.8133262 8 20
alpha8-split10 10 8 0.8148326 0.8148326 9 2
alpha2-split6 6 2 0.8146945 0.8145226 10 3
alpha2-split8 8 2 0.8146446 0.8140415 11 10
alpha4-split14 14 4 0.8146360 0.8144149 12 7
alpha8-split4 4 8 0.8145888 0.8141782 13 9
alpha10-split10 10 10 0.8145723 0.8124440 14 32
alpha10-split12 12 10 0.8144877 0.8144877 15 4
alpha8-split14 14 8 0.8144862 0.8144862 16 5
alpha4-split2 2 4 0.8144727 0.8133429 17 19
alpha8-split8 8 8 0.8144595 0.8136403 18 14
alpha4-split4 4 4 0.8144295 0.8144295 19 6
alpha6-split2 2 6 0.8144167 0.8128639 20 28
alpha4-split6 6 4 0.8143946 0.8135518 21 17
alpha2-split12 12 2 0.8142035 0.8129111 22 27
alpha6-split4 4 6 0.8141715 0.8138360 23 11
alpha8-split12 12 8 0.8141226 0.8117109 24 35
alpha6-split8 8 6 0.8141077 0.8130010 25 26
alpha4-split10 10 4 0.8140530 0.8135328 26 18
alpha6-split10 10 6 0.8140077 0.8131727 27 22
alpha4-split1 1 4 0.8139072 0.8122679 28 33
alpha6-split14 14 6 0.8139053 0.8136223 29 16
alpha6-split12 12 6 0.8138404 0.8118150 30 34
alpha2-split4 4 2 0.8137711 0.8127024 31 30
alpha6-split6 6 6 0.8137583 0.8131883 32 21
alpha8-split6 6 8 0.8135703 0.8124662 33 31
alpha10-split8 8 10 0.8135358 0.8130343 34 24
alpha10-split6 6 10 0.8133986 0.8131367 35 23
alpha8-split1 1 8 0.8125126 0.8095010 36 38
alpha6-split1 1 6 0.8124075 0.8100362 37 37
alpha8-split2 2 8 0.8123671 0.8114474 38 36
alpha10-split1 1 10 0.8117842 0.8078407 39 40
alpha10-split2 2 10 0.8111665 0.8091430 40 39

General conclusions of the analysis per complexity measure

  • The worst results are found for extreme values of the parameters: split = 1 combined with high alpha, or high alpha combined with high split (the first situation being the worst). This is expected: with split = 1 we divide the complexity spectrum into only three pieces (easy - base - hard), and the easy and hard cases are multiplied by a really high value (high alpha). Thus, instead of training with the situation easy - base - hard, we train with super easy - base - super hard and, consequently, the complexity spectrum is not correctly covered.

  • For every complexity measure, different parameter values offer the best solution. In general, they have in common that intermediate values of alpha and split are better, although there are some exceptions: for example, alpha and split equal to 10 work well for Hostility. From this, we can conclude that cycles of intermediate length are better than too short (split = 1) or too long (split = 10) cycles.

  • We have a total of 40 combinations of alpha and split values. Among those 40 cases, the highest accuracy is obtained at the maximum number of models tested (around 300) in 6.5 cases on average across the complexity measures. In particular, the maximum accuracy is obtained with the maximum number of tested models in the following number of cases (out of 40) for each specific complexity measure:

    • CLD: 4 times

    • DCP: 10 times

    • F1: 11 times

    • Hostility: 4 times

    • kDN: 9 times

    • LSC: 3 times

    • N1: 7 times

    • N2: 5 times

    • TD_U: 6 times

It is interesting to examine whether the complexity measures for which the highest accuracy is achieved with the maximum number of models tested are those with the worst performance.
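As a quick check, the per-measure counts listed above can be restated and averaged (a minimal sketch; the vector simply copies the counts reported in this notebook):

```r
# Combinations (out of 40) where the maximum accuracy occurs at the
# maximum number of models tested, per complexity measure (counts above)
max_at_max <- c(CLD = 4, DCP = 10, F1 = 11, Hostility = 4, kDN = 9,
                LSC = 3, N1 = 7, N2 = 5, TD_U = 6)
round(mean(max_at_max), 1)  # 6.6, i.e. the ~6.5 average reported above
```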

  • Regarding the correlation (both Pearson and Spearman) between the ranking of alpha-split combinations by the maximum accuracy over all the ensembles tested and the ranking by the maximum accuracy over only those numbers of ensembles for which there are significant differences: in general the correlation is medium-low, indicating that there is no clear agreement between those significant differences and the real maximum. In any case, we have to weigh the real differences: it may not be worth training more and more models for perhaps a 0.001 increase in accuracy, given that the differences are not significant and all graphs have shown that the accuracy always moves within a narrow interval of values. In particular:

    • CLD: corr = 0.388

    • DCP: corr = 0.721

    • F1: corr = 0.312

    • Hostility: corr = 0.450

    • kDN: corr = 0.471

    • LSC: corr = 0.568

    • N1: corr = 0.562

    • N2: corr = 0.589

    • TD_U: corr = 0.687

      • To deal with this, we are going to compare our method with SOTA methods in two different situations:

        • Considering all the ensembles tested

        • Considering only those where there are significant differences
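To quantify the "medium-low" characterization, the per-measure correlations can be summarized; this sketch simply restates the values reported above, rounded to three decimals:

```r
# Rank correlations between the two orderings, per complexity measure
rho <- c(CLD = 0.388, DCP = 0.721, F1 = 0.312, Hostility = 0.450,
         kDN = 0.471, LSC = 0.568, N1 = 0.562, N2 = 0.589, TD_U = 0.687)
round(c(min = min(rho), median = median(rho), max = max(rho)), 3)
# min 0.312 (F1), median 0.562 (N1), max 0.721 (DCP)
```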